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Abstract 

Mounting evidence of climatic effects on riverine environments and adaptive 
responses of fishes have elicited growing conservation concerns. Measures to 
rectify population declines include assessment of local extinction risk, population 
ecology, viability, and genetic differentiation. While conservation planning has 
been largely informed by neutral genetic structure, there has been a dearth of 
critical information regarding the role of non-neutral or functional genetic varia- 
tion. We evaluated genetic variation among steelhead trout of the Columbia 
River Basin, which supports diverse populations distributed among dynamic 
landscapes. We categorized 188 SNP loci as either putatively neutral or candi- 
dates for divergent selection (non-neutral) using a multitest association 
approach. Neutral variation distinguished lineages and defined broad-scale popu- 
lation structure consistent with previous studies, but fine-scale resolution was 
also detected at levels not previously observed. Within distinct coastal and inland 
lineages, we identified nine and 22 candidate loci commonly associated with pre- 
cipitation or temperature variables and putatively under divergent selection. 
Observed patterns of non-neutral variation suggest overall climate is likely to 
shape local adaptation (e.g., potential rapid evolution) of steelhead trout in the 
Columbia River region. Broad geographic patterns of neutral and non-neutral 
variation demonstrated here can be used to accommodate priorities for regional 
management and inform long-term conservation of this species. 



Introduction 

Management strategies implemented for species conserva- 
tion are highly contingent on a host of correlated life 
history and demographic information. In concert with 
these data, genetic structure is vital for characterizing 
population distinctions and limitations on productivity 
related to the decline of many species (e.g., conifer Cathaya 
argyrophylla pineceae, Ge et al. 1998; Chinese cobra Naja 
atra, Lin et al. 2012; gopher tortoise Gopherus polyphenols, 
Clostio et al. 2012; Chinook salmon, Moran et al. 2013). 
The genetic differentiation of populations across a species 
range is often determined on the basis of phylogenetic ori- 
gins (Wagner et al. 2005), and historical and contemporary 
demography (Ruokonen et al. 2004). More recently, 



genetic variation has been viewed from the perspective of 
physical landscapes or environmental variation (Manel 
et al. 2003; Schoville et al. 2012). A landscape genetics 
approach reveals population variation relative to the 
influences or features in an organism's environment 
(Segelbacher et al. 2010; Sepulveda-Villet and Stepien 
2012), including natural or human erected barriers and 
local climate. Most often it has been described on the basis 
of neutral divergence (Dionne et al. 2008; Narum et al. 
2008), where restricted gene flow is explained in the con- 
text of a heterogeneous, patchwork environment (Latch 
et al. 2011). 

Conservation units, such as a distinct population 
segment (DPS), are established based on a core set of 
criteria including population ecology and viability, 
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ancestry and descent, reproductive isolation, and local 
adaptation (Fraser and Bernatchez 2001; Fraser et al. 
2011). Local adaptation may be inferred from neutral 
genetic structure coincident with habitat or life-history 
variability (Olsen et al. 2011; Blankenship et al. 2011). 
However, direct evaluations of non-neutral population 
differentiation (i.e., putatively adaptive divergence) are 
likely to reveal stronger, more correlative relationships 
(Limborg et al. 2011). Nevertheless, conservation is not 
commonly informed by non-neutral variation, and 
inferences on adaptation based exclusively on neutral 
differentiation risk incorrectly identifying the underlying 
processes affecting population distinctions (Funk et al. 
2012; Landguth and Balkenhol 2012). 

Although full genome sequence data are typically not 
available for nonmodel species, evaluations of adaptive 
variation have recently been addressed using analyses of 
single nucleotide polymorphism (SNP) loci (Willing 
et al. 2010; Matala et al. 2011; Hohenlohe et al. 2010). 
Because SNP loci are commonly found within or 
adjacent to coding and regulatory regions of a genome, 
their allele frequencies may be influenced by selection 
(i.e., non-neutral). Techniques, such as association 
testing and detection of outlier loci, allow evaluation of 
differentiation that provides an improved understanding 
(over neutral loci) of the relationship between signatures 
of adaptive variation and the physical environment, even 
without direct interpretations of phenotypic variation, or 
interrogation of specific functional genes (Narum et al. 
2010b; Matala et al. 2011; Ackerman et al. 2012; Bourret 
et al. 2013). 

Understanding the distribution of adaptive variation 
across landscapes will be crucial in establishing conserva- 
tion priorities (see Crandall et al. 2000; Fraser and Bernat- 
chez 2001), and for anticipating how populations might be 
affected by local and regional changes in climate (Holde- 
regger and Wagner 2008; Isaak et al. 2012a). The effects of 
global climate changes (e.g., rising temperatures) have 
increasingly altered habitats of myriad species, garnering 
the attention of a broad spectrum of researchers (Hickling 
et al. 2005; Milner et al. 2008; Winfield et al. 2010). Cli- 
mate changes can prompt organisms to alter their behavior 
through range expansions (Loarie et al. 2009), or adapt 
over short time periods (rapid evolution; Hoffmann and 
Sgro 2011; Barrett et al. 2010). Some of the most profound 
examples are found among organisms limited by confined 
habitats such as those of fishes, whose habitats are 
prescribed by water routes (e.g., networks of streams and 
lakes). Because of this limitation, they are particularly sus- 
ceptible to environmental changes including water quality 
and temperature (Hari et al. 2006; Rieman et al. 2007; 
Wenger et al. 2011). Fish adapt variably to thermal condi- 
tions in their migratory environment (thermal optimum 
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for aerobic scope), and even populations within the same 
subbasin may be affected disproportionately by dramatic 
thermal shifts (Farrell et al. 2008). 

In the Columbia River Basin (CRB), steelhead trout (On- 
corhynchus mykiss) occur as two evolutionarily divergent 
lineages, delineated east (inland) and west (coastal) of the 
Cascade Mountain Crest. The inland redband trout 
(O. m. gairdneri) are typically a stream maturing, summer- 
run type, while coastal rainbow trout (O. m. irideus) are 
dominated by an ocean maturing, winter-run type (Busby 
et al. 1996; Behnke 2002; Currens et al. 2009; Blankenship 
et al. 2011). Some populations of O. m. irideus also have a 
summer-run life history, though not necessarily genetically 
differentiated from sympatric winter-run populations 
(Busby et al. 1996). Owing to persistent steelhead trout 
population declines throughout the region, managers have 
implemented extensive monitoring and evaluation efforts 
(Busby et al. 1996; Chilcote 1998; ICTRT 2003; Scott 2008; 
Fryer et al. 2012). Five steelhead trout DPSs have been 
delineated within the CRB, and each is currently recognized 
for protection under the Endangered Species Act (ESA): 
the Upper Willamette River, Lower Columbia River, Mid- 
dle Columbia River, Upper Columbia River, and the Snake 
River Basin (U.S. Office of the Federal Register 2006). Riv- 
ers in proximity to the Columbia River estuary lie within 
the Southwest Washington DPS. These conservation 
demarcations are largely contingent on adjacency of water- 
sheds within stream networks, coupled with life history dis- 
tinctions, and neutral genetic population structure. To 
date, there has been little to no direct interpretation of 
non-neutral variation for conservation assessment (Bea- 
cham et al. 1999; ICTRT 2003; Good et al. 2005; USOFR 
2006; Nielsen et al. 2009). 

Studies that describe distinctions among particular steel- 
head trout populations are plentiful (Chilcote et al. 1986; 
Zimmerman and Reeves 2000; Hendry et al. 2002; Matala 
et al. 2008), and address some local conservation concerns. 
However, more extensive evaluations are necessary to 
accommodate broad conservation management priorities 
in a regional context (Beacham et al. 1999; Winans et al. 
2004; Currens et al. 2009; Blankenship et al. 2011). In this 
study, we investigate patterns of neutral genetic variation 
in contrast to non-neutral (putatively adaptive) genetic 
variation. We employed a multi-phased test approach to 
categorize non-neutrality (selection candidacy) of loci that 
was procedurally similar to several previous studies (Na- 
rum et al. 2010b; Hess and Narum 2011; Matala et al. 
2011). Our primary objective was to provide an extensive 
characterization of genetic variation of steelhead trout 
throughout the entire Columbia River drainage to inform 
conservation management of this species. Study questions 
are threefold: (i) Is neutral population structure using 
SNPs consistent with previous studies that utilized other 
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marker types [Winans et al. 2004; Currens et al. 2009; 
Blankenship et al. 2011]?, (ii) Is there significant evidence 
for candidate SNPs under selection and indications of 
putative adaptive variation?, and (iii) How do patterns of 
neutral and non-neutral genetic variation compare and 
contrast among populations that occupy highly variable 
environments across the landscape. Lastly, we discuss how 
non-neutral differentiation in relation to climate change 
may improve our understanding of population viability, 
and promote informed conservation that will complement 
existing methods implemented for the practical manage- 
ment of many diverse and often imperiled species. 

Methods 

Sampling, genotyping, and descriptive statistics 

Our final data set consisted of 9011 steelhead trout sam- 
ples, representing 145 collections spanning the sampling 
years 1996-2011 (Table 1; Table SI). Collections will 
hereafter be referred to as populations. The data set was 
primarily comprised of natural-origin populations 
(n = 133), but some hatchery exceptions are included 
(n = 12). Both the coastal and inland lineages were 



represented but in skewed numbers of populations: 
coastal (« = 24), and inland (n = 121; Table SI; Fig. 1). 
Sample sizes genotyped per population and by major 
population group (MPG) had minimum numbers rang- 
ing from n = 18 to n = 90, and maximums ranging from 
n = 30 to n = 164; Table 1). Genomic DNA was 
extracted from fin or opercle tissues of juvenile and adult 
fish preserved dry on Whatman paper (LaHood et al. 
2008) or stored in individual vials containing 100% 
nondenatured ethanol. For DNA extraction, we used a 
standard Qiagen® DNeasyTM protocol, or NexttecTM 
Genomic DNA Isolation Kits from XpressBio (Thurmont, 
MD, USA) following the manufacturer's standard 
protocol. 

A total of 191 SNP loci were genotyped with Taqman 
assays (Applied Biosystems, Grand Island, NY, USA). Our 
locus panels were comprised of SNPs developed by multi- 
ple sources (Table S2). All loci were ascertained from a 
broad coast-wide sample of populations, including Alaska, 
Washington, British Columbia, California, and Russia. 
Most SNPs were developed in a process of mining 
expressed sequence tags (ESTs) from GeneBank (https:// 
www.ncbi.nlm.nih.gov/genbank/), followed by sequencing 



Table 1. Summary of O. mykiss populations, sample size, and characteristics by distinct population segment (DPS) and MPG in the Columbia River 
Basin. All listed DPSs have an ESA status listing of threatened with the exception of the Southwest Washington DPS, which also includes the Quinault 
River population from the coast of Washington State. 



DPS 


Tributary/Region 


MPG 


# pops 


Lineage 


Run 


Genotyped (n) 
Min Max 


Mean 


Total 


Southwest WA 


Columbia Estuary 


n/a 


4 


Coastal 


Winter 


43 


164 


85.8 


343 


Lower Columbia 


Clackamas 


Cascade 


4 


Coastal 


Mixed 


43 


92 


60.3 


241 


Lower Columbia 


Other 


Cascade; Gorge 


9 


Coastal 


Mixed 


28 


94 


68.1 


613 


Upper Willamette 


Willamette 


Willamette 


3 


Coastal 


Mixed 


39 


93 


60.7 


182 


Upper Willamette 


Willamette/West Side 


Willamette 


3 


Coastal 


Winter 


25 


30 


27.0 


81 


Middle Columbia 


Klickitat 


Cascade East Slope 


10 


Inland 


Summer 


33 


48 


43.4 


434 


Middle Columbia 


Deschutes 


Cascade East Slope 


5 


Inland 


Summer 


31 


63 


51.4 


257 


Middle Columbia 


John Day 


John Day 


10 


Inland 


Summer 


18 


107 


36.3 


363 


Middle Columbia 


Yakima 


Yakima 


7 


Inland 


Summer 


21 


59 


36.9 


258 


Middle Columbia 


Other 


Mixed 


7 


Inland 


Mixed 


34 


148 


100.1 


701 


Upper Columbia 


Wenatchee 


Upper Columbia/East Slope Cascades 


6 


Inland 


Summer 


19 


99 


40.8 


245 


Upper Columbia 


Other 


Upper Columbia/East Slope Cascades 


5 


Inland 


Summer 


90 


99 


94.3 


475 


Snake 


Lower Snake 


Lower Snake 


6 


Inland 


Summer 


49 


105 


83.5 


501 


Snake 


Lower Clearwater 


Clearwater 


5 


Inland 


Summer 


49 


156 


107.8 


539 


Snake 


M. F. Clearwater 


Clearwater 


14 


Inland 


Summer 


35 


99 


56.4 


789 


Snake 


S. F. Clearwater 


Clearwater 


4 


Inland 


Summer 


36 


104 


57.8 


231 


Snake 


Grande Ronde 


Grande Ronde 


7 


Inland 


Summer 


45 


95 


62.7 


439 


Snake 


Imnaha 


Imnaha 


4 


Inland 


Summer 


23 


61 


41.5 


166 


Snake 


S. F. Salmon 


Salmon 


4 


Inland 


Summer 


39 


45 


43.5 


174 


Snake 


M. F. Salmon 


Salmon 


8 


Inland 


Summer 


23 


84 


48.3 


386 


Snake 


Upper Salmon 


Salmon 


7 


Inland 


Summer 


37 


117 


83.1 


582 


Snake 


Other Salmon 


Salmon 


7 


Inland 


Summer 


43 


99 


55.1 


386 


Snake 


Hatchery 


Mixed 


6 


Inland 


Summer 


89 


146 


104.2 


625 


Total 






145 












9011 
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Figure 1 Map of the Columbia River Basin identifying locations of steelhead populations. Shading of the map reflects mean annual temperature 
maximums across the region from the period 1971-2000 (http://www.prism.oregonstate.edU/docs/meta/tmax_30s_meta.htm#7). Populations by 
DPS are: triangles - Snake River, circles - upper Columbia River, squares - middle Columbia River, stars - Upper Willamette River, and diamonds - 
lower Columbia River. Landmarks are: a - Quinault River, b - confluence of Willamette and Columbia rivers, c - Big White Salmon River, d - Klickitat 
River, e - Yakima River, f - confluence of Snake and Columbia rivers, g - confluence of Snake and Clearwater rivers, and h - confluence of Snake 
and Salmon rivers. 



via primers developed for EST sequence flanking regions. 
Generally, the coding or noncoding nature of specific ESTs, 
and associated functions were unknown (see references for 
additional information, Tables S2 and S3). Forward and 
reverse primer/probe sequences for Taqman assays are 
available in Ackerman and Campbell (2012). Polymerase 
chain reaction (PCR) amplification of loci followed the 
protocol of Hess et al. (2012) and included an initial pre- 
amplification step. Successful genotyping for a given sam- 
ple was defined proportionally as <10% missing data (i.e., 
fewer than 19 of 191 SNP genotypes per individual). Three 
species-specific SNP markers (Ocl_gshpx-357, Omy_my- 
clarp404-lll, and Omy_Omyclmk438-96) were used to 
screen for species ID and hybridization between O. mykiss 
and O. clarkii subspecies (Hess et al. 2012). All individuals 
identified as hybrids and congeners (n = 15 coastal, n = 79 
inland) were subsequently removed from the data set prior 
to analyses. Following this screening exercise, the three spe- 
cies-diagnostic loci were removed from the data set. 

Eight SNPs in our pared panel of 188 SNPs have been 
previously identified as a priori candidate loci for selection 



related to environmental variables. Specifically, two loci are 
putatively associated with thermal stress-induced mortality 
(Narum et al. 2013): Omy_hsp47-86 and Omy_OmyP9- 
180. Five SNP loci (Omy_aldB-165, Omy_gdh-271, 
Omy_Ogo4-212, Omy_stat3-273, and Omy_tlr5-205) have 
been previously identified as associated with temperature 
variation in desert versus montane environments, and one 
locus (Omy_hsf2-146) was putatively associated with pre- 
cipitation in the same study (Narum et al. 2010b). Two 
additional loci have been shown to differentiate anadro- 
mous and resident life-history types (Narum et al. 2011): 
Omy_ndk-152 and Omy_LDHB-2_i6. In the following sec- 
tions, these 10 loci are referred to as having 'precedence' in 
association tests (Table S3). 

Locus-specific allele frequencies (i.e., minor allele fre- 
quency; MAF), observed heterozygosity (H Q ) and F sr were 
generated with the program genalex version 6.2 (Peakall 
and Smouse 2006). Pairwise population F ST was calculated 
in genepop v. 3.3 (Raymond and Rousset 1995), and signifi- 
cant among-group variation was determined at P < 0.01 
using arlequin version 3.5 (Excoffier et al. 2005). Pairwise 
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stream distances between population pairs were calculated 
using arcmap and a GIS application developed by D. Graves 
(CRITFC), available at: http://www.critfc.org/fish-and-water 
sheds/ fishery- science/ data-resources-for-scientists/critfc-data- 
download/. Mantel tests of isolation by distance (IBD) were 
evaluated in genalex v.6.2 using matrices of pairwise P ST 
and pairwise stream distance. The Markov Chain Monte 
Carlo (MCMC) approximation of Fisher's exact test 
implemented in genepop v. 3.3 (1000 batches with 1000 
iterations) was used to test for deviations from Hardy- 
Weinberg equilibrium (HWE) expectations, evaluated 
across 188 SNP loci and 145 populations. Linkage disequi- 
librium was tested for all pairs of loci among populations 
using a simulated exact test in genepop. For all pairs of loci 
with significant nonrandom association (linkage), the locus 
with the lower MAF was excluded from further analyses. 
Statistical significance (a) was adjusted for the number of 
simultaneous tests (initial a = 0.05) for both HWE and 
linkage tests via the B-Y FDR method (Benjamini and Yek- 
utieli 2001) to reduce false-positive tests. 

Population structure within and among lineages 

Throughout our methods and analytical approach, the 
coastal and inland lineages were evaluated separately. First, 
we verified the resolving power of our SNP markers to dis- 
cretely differentiate the two steelhead lineages that have 
been characterized in previous studies based on other 
genetic markers (e.g., allozymes, Currens et al. 2009; mi- 
crosatellites, Blankenship et al. 2011), while confirming 
lineage of origin for each population. This test was con- 
ducted irrespective of classification of loci as neutral or 
non-neutral. The program structure version 2.3.4 (Prit- 
chard et al. 2000) was used to estimate the mean admixture 
proportions (Q), or fractional group fidelity among indi- 
viduals in each of 145 populations, setting k = 2 (two 
inferred groups; coastal and inland). Default settings were 
used for the ancestry model (i.e., admixture) and frequency 
model (i.e., correlated allele frequencies), with a burn-in of 
50 000 and 250 000 MCMC repetitions. Locus-specific and 
global pairwise Fst (9 of Weir and Cockerham 1984) were 
calculated within each lineage using genepop. A multivariate 
principle coordinates analysis (PCoA) was performed in 
genalex version 6.2 based on matrices of pairwise F S t to 
quantify amounts of variation. 

Identifying putatively neutral loci 

In classifying the nature of SNP loci, we define three possi- 
ble outcomes or distinctions. A locus may be (i) - unaf- 
fected by selection (neutral), (ii) - directly under selection 
(affecting one or more traits under selection), or (iii) - 
indirectly affected by the process of selection (selection at a 
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linked locus). In all subsequent discussion, the term 'non- 
neutral' variation will be used to represent the latter two 
distinctions. We used an outlier approach based on simula- 
tion methods initially proposed by Beaumont and Nichols 
(1996) to identify outlier SNP loci putatively under selec- 
tion. We conducted outlier tests as implemented in arle- 
quin version 3.5 (Excoffier et al. 2005) to test for 
excessively higher or lower P ST than would be expected 
under the assumptions of neutrality (Beaumont and Nic- 
hols 1996; Beaumont and Balding 2004). Tests were per- 
formed using 100 simulated demes and 50,000 coalescent 
simulations to generate a null distribution under neutral 
expectations around observed f ST values (with confidence 
intervals). Due to the potential high error rate of the hier- 
archical island model (Narum and Hess 2011), a finite 
island model was assumed for outlier tests. Average locus 
heterozygosity versus _F S t was plotted to represent the 1% 
and 99% quantiles, and loci lying below or above these 
quantiles were outliers putatively under balancing or direc- 
tional selection, respectively. 

Outlier loci can be neutral or non-neutral and therefore 
relying solely on outlier tests to characterize the selection 
candidacy of loci risks a resulting high rate of false posi- 
tives. For example, populations may be highly differenti- 
ated because of demography, or skewed patterns of 
isolation by distance (Akey 2009; Hermisson 2009; Narum 
and Hess 2011). In fractal landscapes, such as stream net- 
works, patterns of genetic structure can be coincident with 
patterns or complexity of stream branching, likely resulting 
in elevated variance around neutral F S t (i.e., false-positive 
outliers; Fourcade et al. 2013). For our exploration of non- 
neutral differentiation, outlier methods were used in com- 
bination with regression models with control variables to 
test environmental associations. This reduces the risk of 
false-positive conclusions (Matala et al. 2011), and aids in 
identifying the influences of environmental or habitat het- 
erogeneity on the spatial distribution of genetic diversity 
(inferred selection gradients). 

Defining physical and control variables in an association 
test framework 

Physical habitat variables used to gauge environmental het- 
erogeneity were chosen to represent regional and local cli- 
mate regimes. Latitude and longitude coordinates for each 
population were obtained from field data or were estimated 
using arcmap version 10 (copyright © 2010 ESRI) and were 
used to gather all physical variable measurements (Tables 
S4 and S5). Migration distance from the Columbia River 
estuary and pairwise stream distances (kilometers) were 
calculated using the GIS application previously described. 
Elevation was obtained using Google Earth (Image © 2012 
TerraMetrics) and verified for accuracy in arcmap version 
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10. Monthly averages for temperature and total precipita- 
tion measurements (rain and melted snow) were generated 
using parameter-elevation regressions on independent 
slopes model (PRISM) of the Oregon Climate Service; 
http://www.prism.oregonstate.edu/. Monthly average maxi- 
mum and minimum air temperatures were simulated at 
800-m cell resolution from a model based on climate 
normals from a 30-year period (1971-2000) in PRISM. 
Water temperature readings were unavailable, and there- 
fore, we used air temperature readings as a proxy for 
in-stream temperature. In the Pacific Northwest region, 
stream temperature trends closely with air temperature, 
particularly on temporal scales (Isaak et al. 2012b). How- 
ever, changes in stream temperatures occur more slowly 
and are affected by groundwater and riparian buffering, 
glaciation, and complex topography. In total, five physical 
variables were used to characterize local and regional cli- 
mate: precipitation, temperature, elevation, geographic 
coordinates, and migration distance. We expected autocor- 
relation between all five variables. The precipitation and 
temperature variables we evaluated on the basis of annual 
and seasonal measurements. Geographic coordinates and 
migration distance add directional elements related to 
regional weather patterns. 

The highly dependent nature of f ST on demographic his- 
tory may confound the ability to accurately identify selec- 
tion candidate loci (Beaumont 2005). When isolating 
forces appear correlated with variation across the landscape 
(Narum and Hess 2011), it can be difficult to confidently 
differentiate demographic influences from non-neutral 
(putatively adaptive) associations. Prior to conducting 
association tests, we assembled panels of putatively neutral 
SNP loci (defined by exclusion of outliers) for each lineage, 
and as a conservative measure, we established control vari- 
ables for underlying neutral population structure. Using 
structure version 2.3.4 (Pritchard et al. 2000), we evalu- 
ated k ranging from 1-8 clusters for each lineage, and the 
most likely number of distinct populations (k) was selected 
based on five iterations for each potential k value. We used 
program default settings for the Monte Carlo Markov 
Chain procedure, with 30 000 burn-in followed by 150 000 
MCMC repetitions. The (Ak) statistic derived from the sec- 
ond order rate of change of the likelihood function (Evan- 
no et al. 2005) provided an improved estimate of the mode 
of true (k), and the mean Q values for each population 
were established using the full search algorithm in 
CLUMPP (lakobsson and Rosenberg 2007). Secondly, a 
pairwise population F sx matrix was generated in genepop to 
conduct PCoA in genalex version 6.2. The resulting (Q) 
inferred admixture proportions from structure and the 
first three Eigen vectors (EV) from PCoA analysis were 
used to control for underlying neutral population differen- 
tiation in subsequent association tests. 
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Identifying candidate loci: association with environment 

We used two regression methods to identify potential can- 
didate markers in association with the previously described 
environmental or climate-related variables. First, popula- 
tion MAF at each locus was plotted against each of five 
physical variables (Table S2) in univariate linear regression 
analysis. The P-values for the correlation coefficient were 
calculated in Microsoft Excel. Statistical significance 
(a = 0.05) was adjusted for the number of simultaneous 
tests as a conservative measure using a Bonferroni correc- 
tion to exclude false-positive results (Rice 1989). Eighteen 
populations from the genetically distinct Clearwater River 
are located in a geographic region characterized by high 
precipitation. All 18 ranked in the top 20 for greatest 
amount of spring and summer precipitation among 121 
inland populations (variable nonindependence). In initial 
univariate regression analyses, 86 loci (48%) showed signif- 
icant correlation with either spring or summer precipita- 
tion. Therefore, as a precautionary measure to guard 
against what appeared to be a spurious geographical influ- 
ence, regression coefficients were recalculated in the 
absence of the Clearwater River group. Then, of the origi- 
nal 86 loci only those that maintained a significant correla- 
tion with precipitation were considered 'true' correlations. 

Second, we used DISTLM forward (McArdle and Ander- 
son 2001) to perform multivariate multiple regression with 
permutation tests on locus-specific pairwise -Fst distance 
matrices versus pairwise matrices of values for physical 
variables (Carmichael et al. 2007; Olsen et al. 2010; Hess 
et al. 2011; Matala et al. 2011). Conditional tests that per- 
mute residuals under a reduced model (Anderson and ter 
Braak 2003) were performed in DISTLM forward using a 
stepwise forward selection procedure that fits individual 
variables or sets of variables sequentially in the linear 
model. Once the most informative variable or set of vari- 
ables (explaining greatest amount of variation) is estab- 
lished, the remaining variables are fit to the model, while 
those selected in previous steps are held as constants. Spe- 
cific associations were determined on the basis of highest 
rank (P < 5%) variable for a given locus, and the method 
accounts for correlations that are likely present between the 
variables used to characterize climates (Tables S4-S7). The 
temperature, precipitation, and neutral control (Q and EV) 
variables were evaluated in respective 'sets' as demonstrated 
by Anderson et al. (2004), where seasonal averages for the 
former two were defined as: winter (lanuary-March), 
spring (April-Iune), summer (July-September), and fall 
(October-December). 

In summary, the first two criteria used to flag loci for 
further consideration as candidates in our multitest process 
of categorization were as follows: (i) significant F ST outliers 
at a 99% confidence threshold or (ii) loci with precedence 
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of association [see Table S3]. All 180 loci were tested for 
correlation based on linear regression (criterion #3). Lastly, 
all loci flagged as candidates based on the first three criteria 
were scrutinized on the basis of multivariate multiple 
regression with permutation tests (criterion #4). Note that 
as the final discriminatory step, when multivariate regres- 
sion tests ranked a neutral control variable (Q or EV) as 
highest in significance among all tested variables, the locus 
in question was precluded from categorization as non-neu- 
tral (selection candidacy) regardless of results based on the 
other three criteria (e.g., linear correlation with environ- 
ment). This approach resulted in three subpanels of SNP 
loci for genetic analysis: candidate loci - those meeting 
stipulations outlined previously, neutral loci - those show- 
ing strongest correlation with neutral structure, and 
'ambiguous' loci. The latter category of SNP was primarily 
comprised of F sr outlier loci or loci having precedence of 
association which ultimately failed to reach non-neutral 
status based on regression analysis. Ambiguous loci (by 
definition) could not be confidently characterized as neu- 
tral or non-neutral given conflicting test results and were 
therefore excluded from subsequent evaluations of genetic 
differentiation. Candidate loci putatively under selection 
were subsequently used to evaluate non-neutral or adaptive 
variation. Putatively neutral loci were used to evaluate 
underlying neutral (demographic) population structure. 

Comparing neutral and putatively adaptive variation 

Using the newly established neutral and candidate SNP 
panels, comparisons were drawn based on multiple analyses 
to show the corresponding amount of variation described 
by both neutral and non-neutral variation (Nosil et al. 

2007) . Nonparametric Mantel tests for isolation as a func- 
tion of environment were conducted using 9999 permuta- 
tions of pairwise matrices of _F ST (within lineages) against 
absolute pairwise difference in values for environmental 
variables. Nei's standard genetic distance (Nei 1972) was 
calculated for each lineage, and distance was displayed in 
the topology of an un-rooted neighbor-joining (NJ) tree 
using the analysis program phylip version 3.68 (Felsenstein 

2008) . The SEQBOOT option was implemented to generate 
1000 simulated data sets, and a consensus topology with 
bootstrap support was generated using the CONSENSE 
option. The program treeview version 1.6.6 (Page 1996) 
was used to graphically display the trees. Different trees 
were generated using either neutral or candidate SNP pan- 
els for each lineage. 

A ranking method was employed based on averages for 
five climate variables to compare climate similarities that 
occurred in the clustering of populations within tree topol- 
ogies. Specifically, populations were ranked in descending 
order (warmest to coolest) using corresponding mean 



maximum temperatures for both spring and summer inde- 
pendently. Elevation was ranked in ascending order assum- 
ing lowest elevation equals warmest climate. Lastly, 
populations were ranked in ascending order for mean pre- 
cipitation (least equal to driest/warmest climate) in both 
spring and summer independently. Following independent 
ranking of populations for the five variables, the average 
rank across all five was used to order populations from 
most hot and dry to most cold and wet, and to compare 
relative climates in the tree topologies. 

Results 

Descriptive statistics and population differentiation 

Data quality analyses indicated only minor issues concern- 
ing locus scoring accuracy, nonrandom association of loci, 
and population admixture. We observed departures from 
expected genotypic proportions in 208 of 27 260 tests (188 
loci x 145 populations) at an adjusted significance thresh- 
old of P = 0.0046. Generally, the HWE deviations were not 
specific to any population or locus, spanning 100 of 145 
populations, and 109 of 188 loci. Exceptions occurred in 
both Abernathy Creek (Ref. #12) and Canyon Creek (Ref. 
#9), each with 10 population-specific departures. There 
were also 12 HWE departures at locus OMS00087, which 
was therefore removed from all subsequent analyses. Tests 
for linkage disequilibrium revealed five pairs and one trio 
of loci that remained significantly out of equilibrium in at 
least 10% of populations after adjustment for multiple tests 
(P- value < 0.0001). Linked SNP pairs were as follows: 
(OMS00133 and Omy_rapd-167), (Omy_CRBFl-l and 
Omy_crb-106), (Omy_Il-lb_.028 and Omy_Illb-198), 
(Omy_SECC22b-88 and OMS00169), (Omy_ndk-152 and 
Omy_u09-52.284), and the trio (Omy_GHSR-121, 
OMS00176 and Omy_mapK3-103). In each pair, only the 
locus with the highest MAF was retained in the data set 
(Table S2). 

Following paring of eight loci for HWE and linkage dis- 
equilibrium, the final data set included 180 loci for use in 
subsequent analyses. No SNP locus exhibited fixed allele 
frequencies, and variability ranged widely both within and 
between steelhead trout lineages. Among 24 coastal lineage 
populations, the mean observed heterozygosity ranged 
from H 0 = 0.002 at locus Omy_pad-196 to H G = 0.531 at 
locus OMS00101 (overall mean H D = 0.315). Among 121 
inland lineage populations, the range was H D = 0.027 at 
locus Omy_nach-200 to H 0 = 0.513 at locus Omy_IL17-185 
(overall mean H D = 0.302; Table S2). Among coastal line- 
age populations, we observed a mean locus MAF of 0.246 
ranging from 0.001 at Omy_impal-55 to 0.498 at 
Omy_arp-630. The mean MAF across inland lineage popu- 
lations was 0.236, ranging from 0.024 at Omy_nach-200 to 
0.497 at OMS00070. Generally, large MAFs among 
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populations within lineages are indicative of highly 
differentiated populations. The overall mean pairwise F S r 
between all coastal populations was 0.044. Population-spe- 
cific mean values among 24 populations ranged from pair- 
wise F ST = 0.029 (refJ's 17) to pairwise F ST = 0.079 
(ref.#'s 11). With the exception of one pair of populations 
(ref.#'s 3 & 4; Eagle Creek, North Fork Eagle Creek), all 
comparisons indicated significant among-group variation 
(P < 0.001). The overall mean pairwise F ST between all 
inland populations was 0.045. Population-specific mean 
values among 121 populations ranged from pairwise 
F ST = 0.025 (ref.#'s 33) to pairwise F ST = 0.148 (ref.#'s 77), 
and all pairwise population comparisons indicated signifi- 
cant among-group variation (P < 0.001). Significant isola- 
tion by distance (IBD) was observed in both lineages: 
coastal (R 2 = 0.245, P < 0.0001) and inland (R 2 = 0.083, 
P < 0.0001). 

Differentiating populations by lineage 

When all coastal lineage populations were combined and 
all inland lineage populations were combined to form two 
groups (corresponding to lineage), we observed distinct 
allele frequency differences over the panel of 180 loci. Allele 



frequencies ranged from a low of 0.002 at Omy_1045 19-624 
to a high of 0.719 at Omy_ndk-152. At 41 different (bi-alle- 
lic) loci the minor allele in one lineage was the opposite 
(major allele) for the other lineage. The overall mean pair- 
wise F sr for comparisons between coastal and inland popu- 
lations was 0.145. Population-specific means for 
interlineage comparisons ranged from pairwise F S t = 0.053 
(ref.#'s 25: Bowman Creek) in the inland lineage, to pair- 
wise F S r = 0.228 (ref.#'s 33; Upper Trout Creek) also an 
inland lineage population (data not shown). With the 
exception of 15 populations along the crest of the Cascade 
Mountains, the 145 populations in our analyses formed 
defined clusters according to lineage, where the first PCoA 
plot axis separated lineages and explained 63.4% of the 
total variation in the data. In Bayesian clustering analyses, 
the mean admixture proportion (Q) for two inferred popu- 
lations was 95.3% in Ql for coastal populations, and con- 
versely, 93.8% for Q2 among inland populations (Fig. 2). 
Mean values would likely have been higher, but popula- 
tions adjacent to the crest of the Cascade Mountains 
(demarcating range limits of coastal and inland types) 
appeared admixed between lineages to varying degrees, 
substantially lowering inferred group fidelity. The genetic 
characterization of the Big White Salmon River population 
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Figure 2 Mean admixture proportion plot of structure version 2.3.4 (Pritchard et al. 2000) generated from 180 SNP loci. The histogram shows admix- 
ture proportions for k = 2 inferred groups represented by black bars (coastal proportion) and gray bars (inland proportion). Populations correspond 
to reference numbers (Table S1) in ascending order from left to right (except Ref.#1 Quinault in the 12th position), and are arranged by DPS. Popula- 
tions at the western extreme of the inland range, and eastern extreme of the coastal range are identified in relation to (adjacent) the Cascade Moun- 
tain crest. 
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(Fig. 1; c) was significantly more similar to the coastal line- 
age despite its location among the middle Columbia River 
DPS and was therefore evaluated throughout these analyses 
as a coastal population. 

Controlling for underlying neutral differentiation in 
association testing 

Outlier tests revealed eight loci putatively under directional 
selection in the coastal lineage and 10 in the inland lineage 
(Tables S3, S6, and S7). No influence of balancing selection 
was observed. The remaining putative neutral loci in the 
inland lineage (n = 170) and coastal lineage (n = 172) were 
used in PCoA and Bayesian cluster analyses to establish 
neutral control variables. For both lineages, the most likely 
number of inferred populations was k = 2. However, this 
was likely a result of the relatively deep divergence between 
the Willamette River DPS and remaining lower Columbia 
River populations in the coastal lineage (ICTRT 2003), and 
similar divergence between the genetically distinct Klickitat 
River subbasin and remaining inland populations. To eval- 
uate structure at a finer scale, we chose the appropriate 
number of inferred groups from the next peak in the 
second order rate of change (Ak) for log e [Pr{_K"}], which 
occurred at k = 6 (Q1-Q6) within each lineage indepen- 
dently (Table S4). 

For the coastal lineage, mean Q partially distinguished 
summer-run populations, a less common life-history 
type among this lineage (Q5 = 65%), two groups in the 
upper Willamette DPS (Q2 = 58% and Q4 = 61%), the 
Washington coast population from the Quinault River 
(Ql = 89%), and the Big White Salmon population on 
the Cascade Crest (Q6 = 81%). Most of the remaining 
lower Columbia populations had greatest proportions in 
Q3, ranging from 34% to 77%. For the inland lineage, 
the six inferred populations partially distinguished major 
subbasins, where the middle and south forks of the Sal- 
mon River were both dominant in Ql (74%), the south 
and middle forks of the Clearwater River in Q4 (53%) 
and Q3 (76%) respectively, the Klickitat River in Q2 
(68%), and the upper Salmon River in Q5 (56%). 
Regions with highest mean admixture proportion in Q6 
included the majority from the upper and middle 
Columbia, and the Grande Ronde and Imnaha rivers in 
the Snake River Basin. Most individual populations 
within these regions did not exhibit definitive or strong 
fidelity in Q6 (ranging 27-68%, with a mean of 44%), 
and the highest admixture proportion by watershed 
occurred in the Yakima and John Day Rivers, with mean 
Q6 of 59% and 54%, respectively (Table S4). Differentia- 
tion was slightly higher within the inland lineage, with a 
larger range of pairwise population F ST (0.0002-0.1889, 
mean 0.0421) compared to values within the coastal 
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lineage (range 0.0001-0.1098, mean 0.0413). Results of 
PCoA based on pairwise F S t corroborate primary dis- 
tinctions revealed from structure analyses. Among the 
inland lineage, observed distinctions coincide well with 
several of the MPGs within the Snake River DPS. 

Association tests to identify non-neutral loci: putatively 
adaptive variation 

The adjusted probability threshold for determining signifi- 
cant association using linear regression was P < 0.00029. 
On the basis of linear regression alone, we identified 12 loci 
in the coastal lineage that were significantly correlated with 
one or more environmental variables (Table 2; Table S3). 
In the inland lineage, 59 loci were identified as significantly 
correlated, including 17 loci for spring precipitation, and 
21 loci for summer precipitation. From combined results 
using outlier tests, association precedence, and linear 
regression, we ultimately flagged 26 loci in the coastal line- 
age and 62 loci in the inland lineage for further examina- 
tion of environmental correlation. Following the final 
ranking phase based on multivariate regression, the pared 
locus classifications included nine candidate loci putatively 
under selection in the coastal lineage and 22 candidates in 
the inland lineage (Table 2). All remaining loci were con- 
sidered either neutral or ambiguous (Table S3). The group 
of loci ultimately classified as ambiguous were typically 
comprised of outlier loci for which the highest ranked vari- 
able was one of our control variables for neutral differentia- 
tion (Q or EV; Table S4). The SNPs deemed ambiguous 
may indeed be under selection, but we failed to identify 
associations with the suite of environmental variables tested 
here. 

Of the eight SNP loci showing precedence as candidates 
for local climate from previous studies, seven exhibited 
significant association with climate variables in at least one 
lineage in our broad-scale evaluation across many 
populations. Two loci previously identified as associated 
with survival under thermal stress (Omy_hsp47-86 and 
Omy-P9-180; Narum et al. 2013), were significantly associ- 
ated with either temperature or precipitation (Fig. 3). Two 
loci previously identified as candidates for temperature 
(Omy_stat3-273 and Omy-gdh-271; Narum et al. 2010b) 
were found to be highly correlated with precipitation in at 
least one lineage (Tables 2 and S7), but not for tempera- 
ture. Previous associations of locus Omy_hsf2-146 with 
precipitation, and locus Omy_aldB-165 with temperature 
(Narum et al. 2010b) were corroborated within the coastal 
lineage. Finally, locus Omy_tlr5-205 previously associated 
with temperature (Narum et al. 2010b) was significant for 
temperature and precipitation in the inland lineage. 
Notably, two of these loci (Omy-P9-180, Omy_stat3-273) 
were candidates for climate association in both lineages. 
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Table 2. Candidate SNP markers associated with climate for O. mykiss in the Columbia River Basin. 



SNP locus 


Notes 


Inland lineage 


Climate associations 


Coastal lineage 


Climate associations 


Omy_hsc7 15-80 




Candidate 


P 




Neutral 




Omy_SECC22b-88 




Candidate 


P 




Neutral 




OM500014 




Candidate 


P 




Neutral 




OMS00062 




Candidate 


P 




Neutral 




OMS00151 




Candidate 


T 


P 


Neutral 




Omy_97660-230 




Candidate 


T 


D 


Neutral 




Omy_CRBF1-1 




Candidate 


P 


D 


Neutral 




Omy_e1-147 




Candidate 


D 


T 


Neutral 




Omy_GHSR-121 




Candidate 


P 




Neutral 




Omy_IL6-320 




Candidate 


T 


P 


Neutral 




Omy_metA-161 




Candidate 


T 


P 


Neutral 




Omy_nkef-241 




Candidate 


D 




Neutral 




Omy_ntl-27 




Candidate 


D 


P 


Neutral 




Omy_u09-53.469 


0, 


Candidate 


T 


P D 


Neutral 




Omy_UT16_2-173 


0, 


Candidate 


D 




Neutral 




OMY1011SNP 




Candidate 


P 




Neutral 




Omy_hsp47-86 


(*) 


Candidate 


T 


D 


Ambiguous 




Omy_tlrB-205 


(!) 


Candidate 


T 


P D 


Ambiguous 




Omy_gdh-271 


(f) 


Candidate 


E 




Ambiguous 




Omy_97954-618 


0 C 


Candidate 


P 


D 


Ambiguous 




Dmv OmvP9-180 

\—t iiiy \-/ 1 1 1 y r zf low 


(*) 


C ^nHiH^tp 


T 


P 


Csnrlirl^tp 

^ d 1 LIILICilC 


j 


Omy_stat3-273 


(t) 


Candidate 


P 




Candidate 


P 


Omy_aldB-165 


(t) 


Ambiguous 






Candidate 


T 


Omy_hsf2-146 


(t) 


Ambiguous 






Candidate 


P D 


OMS00008 




Neutral 






Candidate 


P 


OMS00058 




Neutral 






Candidate 


T D 


OMS00111 




Neutral 






Candidate 


P D 


Omy_bcAKala-380rd 


o c 


Neutral 






Candidate 


T D 


Omy_cox1-221 




Neutral 






Candidate 


DTP 



Notes include: inland '0\' and coastal outlier loci '0 Ci ' and reference numbered 'precedence' loci: '*'. Thermal stress association (Narum et al. 2013); 

'f. Temperature or precipitation association (Narum et al. 2010b). 

Climate associations are: P, precipitation; T, temperature; E, elevation; D, distance. 



Additional novel candidate loci were detected that have not 
previously been identified as loci putatively under 
selection. 

The two loci that were previously determined to be 
associated with differentiating resident from anadromous 
life-history forms (Omy_ndk-152 and Omy_LDHB-2_i6; 
Narum et al. 2011) could not be directly tested for associa- 
tion with life-history forms because most populations in 
our analyses were field identified as anadromous (e.g., juve- 
nile smolt or adult steelhead phenotypes). However, 
Omy_ndk-152 was a significant F ST outlier. A single locus 
(Omy_Ogo4-212) with precedence of putative association 
for climate or life history had no confirmed associations for 
any habitat variable in our analyses and was deemed 
ambiguous as it may be a candidate at smaller, more local 
scales. 

Among coastal lineage populations, precipitation and 
distance from the ocean (i.e., migration distance, and 
lat/long coordinates) were equally the most commonly 
correlated environmental factors. To clarify, the com- 



mon point of origin for all measured migration dis- 
tances was the Columbia River estuary, and therefore, 
distance associations were not necessarily an example of 
IBD gene flow which relates to direct distance between 
populations. In the inland lineage, environmental corre- 
lations were dominated by precipitation, then tempera- 
ture. Specifically, spring, summer and total annual 
precipitation were the physical variables most often 
associated with genetic variation, and several signifi- 
cantly correlated loci spanned both lineages (Fig. 3). In 
some cases, the ranking of variables in multivariate 
regression was inconsistent between tests of individual 
variables versus sets of variables. For example, while an 
individual variable (e.g., Ql, summer precipitation) may 
have ranked highest for a given locus, the correspond- 
ing variable sets (e.g., Q1-Q6, total precipitation) may 
have ranked low for that same locus (Table S6 and S7). 
An explanation for this outcome, centered on differen- 
tial environmental influences among life-history stages, 
follows in the discussion. 
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Figure 3 Linear regression plots, representing results for eight of the most highly correlated SNP loci (see Table 2). Plots A-C are temperature associ- 
ations in the coastal lineage, while D-H are precipitation associations in the inland lineage. Symbols correspond with DPS: diamond - lower Columbia 
and Southwest Washington, star - upper Willamette, square - middle Columbia, circle - upper Columbia, and triangle - Snake. 



Comparing neutral and non-neutral differentiation 

The NJ tree topologies based on neutral SNP panels 
generally showed genetic distance relationships among 



populations that accurately aligned with the five DPS 
delineated under the ESA (Figs 4A and 5A). However, 
in the coastal lineage, the Clackamas River populations 
(Lower Columbia River DPS) grouped closely with 
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populations in the upper Willamette River DPS, while 
populations in the upper Willamette River west side 
tributaries were distinctly partitioned from upper Wil- 
lamette east slope populations (Fig. 4A). The known 
summer-run populations in the coastal lineage cluster 
together with significant bootstrap support, rather than 
clustering with winter-run populations from the same 
tributaries (i.e., Kalama and Hood rivers). Middle and 
Upper Columbia MPGs and five MPGs in the Snake 
River (Ford 2011) are well differentiated within the 
inland lineage, although bootstrap support was minimal 
in some instances. Finer scale definition observed in the 
Salmon and Clearwater rivers indicates significant 
genetic distinctions between middle and south fork 
populations from both subbasins (within MPGs), and 
between each of those groups and corresponding popu- 
lations in the lower sections of both subbasins 
(Fig. 5A). These within- watershed distinctions in both 
the Clearwater River and Salmon River subbasins are in 
agreement with previous reports (Nielsen et al. 2009), 
but the level of resolution that differentiates the Clear- 
water River subbasin from the Salmon River subbasin 
has not been previously reported. In the tree topology 
for neutral structure in the inland lineage, populations 
within MPGs or subbasins were also frequently charac- 



terized by climate similarity. However, deviations from 
ESA- or regionally based (e.g., distance) clustering were 
rare regardless of differences or similarities in climate. 

The NJ tree topologies based on non-neutral structure 
were comprised of all candidate SNPs in the coastal (n = 9 
loci) and inland (n = 22 loci) lineages. Trees presented 
population clustering patterns that often did not corre- 
spond with delineations of DPS or MPG (Figs 4B and 5B), 
albeit with limited bootstrap support. This was presumably 
a function of topologies based on small numbers of loci. 
Climate based clustering patterns were more apparent in 
the inland lineage, where several of the relationships 
depicted in the non-neutral tree topology conformed to 
warm or cool climate similarity irrespective of subbasin or 
MPG distinction. For example, the neutrally distinct Klicki- 
tat River and Yakima River groups were each split to show 
some association to climate, most Clearwater and Salmon 
river groups with similar climates were combined on the 
same basal node or primary branch of the tree, and the five 
lower Clearwater River populations were differentiated by 
climate distinctions. Overall, NJ results based on non-neu- 
tral SNPs for the coastal lineage make it difficult to discern 
any basis for population similarities within branching pat- 
terns. In particular, climate rankings for coastal popula- 
tions were noninformative for understanding genetic 
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Figure 4 Neighbor-joining trees depicting Nei's genetic distances among coastal lineage populations based on (A) neutral variation - 1 58 SNPs, and 
(B) non-neutral variation - nine SNPs. Bootstrap support exceeding 50% appears at nodes. Diamonds represent the Lower Columbia DPS, while stars 
represent the Upper Willamette DPS. Numbers in parentheses correspond to population reference numbers (Table S1). 



©201 4 The Authors. Evolutionary Applications published by John Wiley & Sons Ltd 7 (2014) 682-701 



693 



Adaptive differentiation of Columbia River steelhead 



Matala et al. 




Figure 5 Neighbor-joining trees depicting Nei's genetic distances among inland lineage populations. Numbers at branch ends correspond to popula- 
tion reference numbers (Table S1) Trees are based on (A) neutral variation - 146 SNPs, and (B) non-neutral variation - 22 SNPs. Bootstrap support 
exceeding 50% appears at nodes. Population symbols correspond with DPS: triangles - Snake River, circles - upper Columbia River, squares - middle 
Columbia River. Major drainage sub-basins are labeled on the left. The symbol (*) represents climate related distinction among Lower Clearwater 
River populations and (£) represents differences among Yakima River populations based on climate. The symbol (®) identifies the branch adjoining 
Clearwater River and M. F. Salmon River populations. The color scale indicates population ranking by climate (1 = coldest, 115 = warmest) for 1 1 5 
inland populations identified by reference number (Table S1). 



distance relationships associated with physical variables 
(Fig. 4A; climate ranks not shown). 

Allele frequencies were most commonly correlated with 
the precipitation variables in association tests. This was fur- 
ther verified with Mantel tests of isolation by precipitation 
(IBP). Tests were based on a five-locus subset of nine can- 
didate loci in the coastal lineage and a 15 -locus subset of 22 
candidate loci in the inland lineage; subsets were chosen 
based on significant associated with precipitation specifi- 
cally (Table 2; Table S6 and S7). For the coastal lineage, a 
test based on all five loci indicated significant IBP for mean 
annual precipitation (R 2 = 0.235; P = 0.0009), mean 
spring precipitation (R 2 = 0.201; P = 0.0006), and mean 
summer precipitation (R 2 = 0.296; P = 0.0004). The panel 
of 158 neutral loci for the coastal lineage was also tested to 
account for potentially confounding neutral patterns of 
IBP, but none were observed. Mantel tests of IBP based on 
15 precipitation-associated loci in the inland lineage (pair- 
wise comparisons between 121 populations) indicated 



significant IBP for mean spring precipitation (R 2 = 0.039; 
P = 0.0001), and for mean summer precipitation 
(R 2 = 0.037; P = 0.0001); no correlations were observed 
between seasonal precipitation and the panel of inland 
lineage neutral SNPs (n = 146). 

Discussion 

This study demonstrates patterns of neutral and non-neu- 
tral variation in O. mykiss at a broad geographic scale 
that will be a valuable contribution to improved conser- 
vation management of this species in the CRB. Neutral 
structure was complementary to preceding studies (Win- 
ans et al. 2004; Currens et al. 2009; Blankenship et al. 
2011) and confirms the existence of two deeply divergent 
steelhead trout lineages (i.e., coastal and inland) across 
the species range, along with finer scale population struc- 
ture. We identified neutral divergence among steelhead 
trout populations that coincides substantially with current 
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DPS delineations among lineages, as well as MPGs (ICT- 
RT 2003; Good et al. 2005; Ford 2011) demarcated within 
the Snake River Basin (five MPGs), Middle Columbia 
River (four MPGs) and Upper Columbia River (one 
MPG). However, using our SNP panel, fine-scale resolu- 
tion of neutral differentiation was detected in the Snake 
River DPS at a level previously unreported based on other 
marker types. Similar to other studies (Nielsen et al. 
2009; Campbell et al. 2012), we found no evidence for 
multiple evolutionary lineages in the Snake River, but 
local environments may influence their differentiation. 
For reference, six Snake River hatchery stocks were genet- 
ically similar to their natural-origin counterparts that dis- 
tinguish the Snake River DPS. In the coastal lineage, 
neutral differentiation among populations from east and 
west side tributaries in the upper Willamette River does 
not appear to fit DPS unit distinctions and may warrant 
further investigation to define conservation boundaries. 
From our analyses, the Big White Salmon river popula- 
tion is currently the only population within the Middle 
Columbia River DPS that is more consistent with a 
coastal lineage origin, suggesting further evaluation of its 
classification may be necessary. 

Additionally, we show clear evidence for non-neutral 
(putatively adaptive) variation that is significantly 
associated with climate in the region. Specifically, several 
candidate markers were primarily associated with 
precipitation and temperature. This study demonstrates 
that candidate markers can be applied at broad geographic 
scales to describe the extent of potential local adaptations 
across highly variable climates. To evaluate non-neutral 
differentiation, our designation of candidate SNP loci was 
applied conservatively to reduce false-positive results. Con- 
clusions of climate association were based on a large and 
diverse number of populations and supported by a frame- 
work of multiple test criteria and strict likelihood thresh- 
olds (Balkenhol et al. 2009; Schoville et al. 2012). De Mita 
et al. (2013) suggest using multiple robust methods, and 
emphasize numbers of populations over numbers of 
individuals per population for improved confidence in 
determining selection candidacy of loci. A greater number 
of candidate loci were discovered in the inland lineage, rep- 
resented by a larger number of populations than were 
observed in the coastal lineage. In addition, the interior 
region of the CRB is larger, characterized by a highly vari- 
able environment relative to the coastal environment 
(greater range of wet/dry and warm/cool conditions). Sev- 
eral loci were identified in association with tested environ- 
ment or landscape variables, but precipitation and 
temperature proved to be the most common (# loci) and 
strongest factors in non-neutral population differentiation 
that spanned both lineages (e.g., at SNP Omy_stat3-273, 
Omy_OmyP9-180). 
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Although our association tests indicated relationships 
between environmental variation and genetic heterogeneity 
(i.e., allele frequency variation), it is challenging to decipher 
the biological relevance of those correlations. Climate vari- 
ables such as temperature may affect behaviors and pheno- 
types alike (Perry et al. 2001; Zydlewski et al. 2005), 
sometimes relatively rapidly in response to perturbations 
(Kovach et al. 2012). Previous landscape genetics studies in 
salmonids have shown significant allele frequency correla- 
tion with precipitation that have been described as neutral 
influences on population structure (Narum et al. 2008; 
Blankenship et al. 2011; Olsen et al. 2011). Olsen et al. 
(2011) for example, present a compelling discussion on pos- 
sible correlations between precipitation and gene flow. This 
may occur if increased flooding results in decreased stream 
stability, which in turn may affect fish dispersal. Alterna- 
tively, the similarity of outcomes across study locales may 
suggest that the distribution of precipitation across the land- 
scape elicits, or is indicative of common adaptive responses. 
For example, streambed scour related to rain-on-snow 
events may have lesser impact on fish that adapt by burying 
eggs at deeper depth (Goode et al. 2013). Thus, climatic 
variables such as precipitation may impart either 'neutral' 
landscape effects, 'non-neutral' (putatively adaptive) land- 
scape effects, or both. In either case, one could argue that 
adaptation and selection play a key role in shaping the 
genetic landscape, which is more apt to be revealed through 
non-neutral genetic variation (Limborg et al. 2011). 

The genetic population structure we observed differed 
markedly depending on whether our evaluations were 
based on neutral or non-neutral differentiation. Neutral 
differentiation generally reflected a pattern of distance- 
restricted gene flow, and in the context of demographic 
factors, population clustering was relatively intuitive 
within and among regions (clustering by subbasins, DPS, 
etc.). In contrast, population-clustering patterns identified 
using candidate loci (non-neutral differentiation) were 
presumably based upon environmental variation and fre- 
quently did not align with current steelhead trout DPS 
delineations. Moreover, clustering similarities based on 
non-neutral variation did not necessarily coincide with 
geographic proximity, nor were patterns among popula- 
tions always transparent in regard to biology (e.g., coastal 
lineage migration timing). Thus, the juxtaposition of 
genetic signals show that neutrally dissimilar populations 
may exhibit non-neutral similarity (and vice versa) related 
to environment, and irrespective of geographic distance. 
Compared with neutral variation, candidate loci revealed 
some novel distinctions between populations, presumably 
reflective of environmental differences between regions 
and locales, particularly for the inland lineage. For the 
coastal lineage, less variable environments may produce 
moderate selective pressure for local adaptation among 
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surveyed portions of the species range. More extreme 
environments in the southern portion of the coastal line- 
age range (e.g., the Sacramento River system) might be 
expected to provide stronger selective pressure. 

The relative influences of climate or environment on 
genetic variation (e.g., putative adaptive responses) may 
occur during particular life-history stages of an organism, 
which can be difficult to discern. Temperature for example 
has been shown to have variable effect on different life 
stages of fish (Fowler et al. 2009). In our multivariate 
regression analysis, we noted inconsistencies between tests 
on individual physical variables and corresponding sets of 
seasonal variables. Seasonal environmental variation may 
impart a disproportionate selective influence coincident 
with age related behaviors (e.g., emergence time or outmi- 
gration time; Coleman and Fausch 2007). Our results of 
non-neutral differentiation indicate that precipitation dur- 
ing specific juvenile rearing or adult spawning periods may 
be more effectual or correlative than average annual fluctu- 
ations in precipitation. 

If correlations between loci and physical variables are 
indicative of an adaptive influence (Bonin et al. 2009), they 
are not necessarily representative of direct causal relation- 
ships (e.g., selection for specific phenotypes). Note that 
from among our panel, the locus Omy_stat3-273 was previ- 
ously identified among desert and montane resident 
O. mykiss populations as being associated with temperature 
(Narum et al. 2010c). We demonstrated in our evaluation 
that this locus was correlated with precipitation but not 
with temperature, yet in actuality it is likely associated with 
overall climate and thus affected by myriad aspects of habi- 
tat variability. We identified patterns of putatively adaptive 
variation associated with climate, but corresponding phe- 
notypic trait variation was not measured. However, distin- 
guishing whether phenotypic changes are genetically based 
or the result of phenotypic plasticity has proven difficult 
(Merila and Hendry 2014). Often many genes are involved 
in adaptive responses to specific environments and/or cli- 
mates (Kassahn et al. 2007) and controlled experiments 
would be necessary to make direct inferences on interac- 
tions between environments, phenotypes, and specific 
genes. Rather than providing unequivocal evidence of 
adaptation on the basis of phenotypic attributes, our study 
identifies locus associations that can be seen as indicators 
of related but undetermined causative environmental 
forces, such as early seasonal onset (Bradshaw and Holzap- 
fel 2008). For example, distinct populations of steelhead 
trout in Oregon's Hood River occupy either glacial fed or 
spring fed tributaries (Underwood et al. 2003; Matala et al. 
2009). Distinctions likely arose due in part to selection in 
variable environments, characterized by in-stream flow 
rate, thermal stability, and other stressors (Lytle and Poff 
2004). When altered by the forces of climate change (e.g., 
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lengthening seasons) those environments may in turn affect 
phenotypes such as spawn timing or migration timing 
(Crozier et al. 2011; Reed et al. 2011). Nevertheless, it can- 
not be stated unequivocally that climate associations are 
indicative of loci under direct selection. 

Global climate change and effects of climate on ecosys- 
tems has earned the attention of the scientific community 
and freshwater fish are expected to be negatively impacted 
(McCullough et al. 2009). There is growing consensus that 
habitats occupied by salmonid species in North America 
and Europe will or have already experienced climate related 
alterations (Crozier and Zabel 2006; Battin et al. 2007; Win- 
field et al. 2010; Nielsen et al. 2013). Most of the CRB has 
been identified as habitat at high risk of thermal stress in sal- 
monids (Wu et al. 2012), and conservation of many popula- 
tions is already warranted (Busby et al. 1996; ICTRT 2003; 
USOFR 2006). Climate-altered habitats may lead to rapid 
evolution (Barrett et al. 2010) or shifting range margins of 
myriad species (Hickling et al. 2005; Chen et al. 2011). In 
fishes, this may manifest as range contractions in cold envi- 
ronments or expansions in warm environments, and adap- 
tive versus neutral genetic divergence is likely to occur at 
differing spatial and temporal scales (Conover et al. 2006). 

Therefore, conservation strategies based heavily on neu- 
tral genetic variation, with an under- emphasis on the dis- 
tribution of non-neutral variation, risk detrimental impacts 
to locally adapted population segments and should be scru- 
tinized (Pearman 2001; Schwartz et al. 2009). We concur 
with Funk et al. (2012) that neutral and non-neutral ele- 
ments of differentiation are not mutually exclusive and 
should be used in concert to provide a cautious and consci- 
entious description of species diversity in a management 
framework. Monitoring trends between neutral and non- 
neutral differentiation and the corresponding degree of dis- 
parity observed over time may be fundamentally important 
for addressing impacts of climate change (i.e., adaptive 
responses). This will conceivably have a potential role in re- 
shaping conservation boundaries to safeguard species 
diversity. The design of our steelhead trout study follows 
this perspective; identifying selection candidate loci and 
environmental associations, then drawing comparisons 
with patterns of neutral population divergence. We 
observed non-neutral genetic heterogeneity of populations 
in association with environment. However, in our results, 
neutral diversity encapsulated overall steelhead trout diver- 
sity with better clarity and finer resolution across estab- 
lished DPSs. Nevertheless, monitoring of historical and/or 
contemporary non-neutral differentiation of populations 
adds a unique dimension to the characterization of these 
populations (Willing et al. 2010). The candidate markers 
identified in our study are expected to be useful for model- 
ing population level responses to climate change, and 
future population genomics approaches with thousands of 
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steelhead SNPs should provide improved estimation and 
resolution of adaptive differentiation in the CRB. 

There is a palpable consensus among researchers and 
managers cautioning against broad scale, rigid approaches 
to conservation, and acknowledging the need to address 
productivity limitations of myriad organisms in need of 
protection (Clostio et al. 2012). However, conflicting opin- 
ions on the role of genetics in conservation still pervade 
management agendas (Waples 1995; Fraser and Bernatchez 
2001; Garcia de Leaniz et al. 2007; Allendorf et al. 2010). 
Maintaining overall conservation unit viability must neces- 
sarily account for the viability of all demographically impor- 
tant population components within those units (Crandall 
et al. 2000; Latch et al. 2011), particularly where demo- 
graphic instability (e.g., genetic drift in small populations) 
may reduce overall adaptive variation or adaptive potential 
within regions (Kawecki and Ebert 2004; Schoville et al. 
2012). It is likely that the relevance and contribution of 
non-neutral variation is frequently overlooked or underuti- 
lized in conservation planning, but given recent calls for the 
incorporation of climate science in application of the ESA 
(e.g., McClure et al. 2013), the utility of such information 
should be highlighted. In the absence of efforts to regularly 
evaluate putatively adaptive population differences, there is 
presumably a greater risk of the loss of genetic diversity as 
climates and habitats continue to change through time. 
Over the long-term, the adaptive potential of many species 
across taxa will need to be further explored and considered 
in conservation planning. The real effects of a changing 
environment, including shifting ranges, may not be uni- 
formly realized or fit tightly into predefined units (e.g., 
ESU, DPS, MPG). 
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